[FLINK-38857][Model] Introduce a Triton inference module under flink-models #27385
.withDescription(
        Description.builder()
                .text(
                        "Reserved for future use (v2+). Currently has NO effect in v1. "
I am curious - what is the benefit of adding something we do not use yet? It would seem simpler to add it when we use it.
public static final ConfigOption<Long> TIMEOUT =
        ConfigOptions.key("timeout")
                .longType()
                .defaultValue(30000L)
we should use the Duration type for this
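To illustrate the reviewer's point, here is a minimal, standard-library-only sketch of why a `Duration` is usually preferred over a bare `long`. It does not use Flink's classes; the class and field names (`TimeoutSketch`, `LEGACY_TIMEOUT_MILLIS`, `TIMEOUT`) are hypothetical. In Flink itself, the `ConfigOptions` builder offers `durationType()` as the counterpart of `longType()`, which is presumably what the suggestion refers to.

```java
import java.time.Duration;

public class TimeoutSketch {

    // Style under review: a bare long. The unit (milliseconds) is only implied.
    public static final long LEGACY_TIMEOUT_MILLIS = 30000L;

    // Suggested style: the unit is part of the value itself.
    public static final Duration TIMEOUT = Duration.ofSeconds(30);

    public static void main(String[] args) {
        // Both representations describe the same 30-second timeout.
        System.out.println(TIMEOUT.toMillis() == LEGACY_TIMEOUT_MILLIS);

        // A Duration can also be parsed from an ISO-8601 string such as "PT30S".
        System.out.println(Duration.parse("PT30S").equals(TIMEOUT));
    }
}
```

With a `Duration`, callers cannot confuse seconds with milliseconds, and the value converts cleanly to whatever unit the underlying HTTP client expects.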
Description.builder()
        .text(
                "Full URL of the Triton Inference Server endpoint, e.g., %s",
                code("http://localhost:8000/v2/models"))
It is worrying that the example uses http, i.e. it is insecure.
.booleanType()
.defaultValue(false)
.withDescription(
        "Whether this is the start of a sequence for stateful models.");
it would be useful to have a reference as to what a sequence is in this context
ConfigOptions.key("compression")
        .stringType()
        .noDefaultValue()
        .withDescription("Compression algorithm to use (e.g., 'gzip').");
can we give a reference or list the valid values for this?
public static final ConfigOption<String> CUSTOM_HEADERS =
        ConfigOptions.key("custom-headers")
                .stringType()
Headers are MAP<STRING, ARRAY>. Can we avoid JSON here and use the standard Flink map type, passing lists of strings the way list types expect?
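As a rough illustration of the multi-valued header shape the reviewer describes, the sketch below applies a `Map<String, List<String>>` to an HTTP request. It uses the JDK's `java.net.http` API purely for demonstration; the HTTP client the module actually uses is not shown in this thread. The class name, header names, and endpoint URL are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HeaderMapSketch {

    /** Copies every (name, value) pair from a multi-valued header map onto a request. */
    public static HttpRequest withHeaders(URI endpoint, Map<String, List<String>> headers) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(endpoint);
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            for (String value : entry.getValue()) {
                // header(...) appends, so a repeated name keeps all of its values.
                builder.header(entry.getKey(), value);
            }
        }
        return builder.build();
    }

    /** Builds a request against a hypothetical endpoint and returns one header's values. */
    public static List<String> demoValues() {
        Map<String, List<String>> headers = new LinkedHashMap<>();
        headers.put("X-Trace-Id", List.of("a", "b"));
        headers.put("Accept", List.of("application/json"));
        HttpRequest request =
                withHeaders(URI.create("https://triton.example.com/v2/models"), headers);
        return request.headers().allValues("X-Trace-Id");
    }

    public static void main(String[] args) {
        System.out.println(demoValues());
    }
}
```

A typed map keeps each header name and its values structured, so no JSON string has to be parsed and validated at runtime.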
Purpose of this change
This PR introduces an optional Triton-based inference module under flink-models, enabling Apache Flink to invoke NVIDIA Triton Inference Server for model inference.
The integration is implemented at the runtime layer via the existing model provider SPI, allowing users to define Triton-backed models using CREATE MODEL and execute inference through ML_PREDICT, without requiring any changes to the planner, runtime semantics, or SQL behavior.
This module is designed as a reusable and extensible foundation for integrating external inference services into Flink’s model inference framework.
Summary of changes
flink-model-triton module under flink-models
Verification
Impact assessment
This change is fully optional and isolated under flink-models.
Documentation
docs/connectors/models/triton.md
Related issues